This article discusses our research on polyphonic music transcription using non-negative matrix factorisation (NMF). Applying NMF to polyphonic transcription offers an alternative approach in which the frequency spectra observed in polyphonic audio are viewed as an aggregation of spectra from monophonic components. However, it is not easy to find accurate aggregations with a standard NMF procedure, since there are many ways to satisfy the factorisation V ≈ WH. Three limitations arise when standard NMF is applied to factor frequency spectra: (i) the permutation of the transcription output; (ii) the unknown factorisation rank r; and (iii) the tendency of the factors W and H to become trapped in a sub-optimal solution. This work explores heuristics that exploit the harmonic information of each pitch to tackle these limitations. In our implementation, this harmonic information is learned from training data consisting of the pitches of a desired instrument, while the unknown effective r is approximated from the correlation between the input signal and the training data. This approach makes effective use of domain knowledge. The empirical results show that the proposed approach can significantly improve the accuracy of the transcription output compared with the standard NMF approach.
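
To make the factorisation V ≈ WH concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes the common multiplicative-update rule for the Euclidean cost, and it models "harmonic information learned from training data" simply as a fixed template matrix W whose columns are synthetic harmonic spectra. The function names `nmf` and `harmonic_template`, the 1/h harmonic amplitudes, the 16-bin toy spectrum and the toy piano roll are all illustrative assumptions.

```python
import numpy as np


def harmonic_template(f0_bin, n_bins):
    """Toy pitch spectrum (assumption): peaks at multiples of f0_bin, amplitude 1/h."""
    w = np.zeros(n_bins)
    for h, b in enumerate(range(f0_bin, n_bins, f0_bin), start=1):
        w[b] = 1.0 / h
    return w


def nmf(V, r=None, W_fixed=None, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (frequency x time) as V ~ W @ H.

    Uses multiplicative updates for the Euclidean cost. If W_fixed is given,
    the templates stay fixed and only the activations H are updated, so
    row k of H always refers to template k (no permutation ambiguity).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    if W_fixed is not None:
        W = np.asarray(W_fixed, dtype=float)
        r = W.shape[1]
    else:
        W = rng.random((n_freq, r)) + eps  # random non-negative start
    H = rng.random((r, n_time)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if W_fixed is None:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data: three "pitches" with fundamentals at bins 2, 3 and 4.
W_true = np.column_stack([harmonic_template(f, 16) for f in (2, 3, 4)])

# Toy piano roll: rows are pitches, columns are time frames.
H_true = np.array([
    [1, 1, 0, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 0, 1, 1],
], dtype=float)

V = W_true @ H_true  # observed polyphonic spectra

# Template-constrained factorisation: W is fixed, only H is estimated.
_, H_est = nmf(V, W_fixed=W_true, n_iter=500)
rel_err = np.linalg.norm(V - W_true @ H_est) / np.linalg.norm(V)
print("relative reconstruction error:", rel_err)
print("estimated piano roll:")
print((H_est > 0.5).astype(int))
```

Calling `nmf(V, r=3)` without `W_fixed` runs the unconstrained variant, in which the rank r must be supplied by hand and the order of the recovered components is arbitrary. These are two of the three limitations listed above.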